Overview

Dataset statistics

Number of variables: 14
Number of observations: 1006969
Missing cells: 3899090
Missing cells (%): 27.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 531.4 MiB
Average record size in memory: 553.4 B

Variable types

CAT: 9
NUM: 4
BOOL: 1

Warnings

0 has a high cardinality: 689577 distinct values [High cardinality]
1 has a high cardinality: 96 distinct values [High cardinality]
2 has a high cardinality: 10762 distinct values [High cardinality]
5 has a high cardinality: 429 distinct values [High cardinality]
9 has a high cardinality: 5946 distinct values [High cardinality]
11 has a high cardinality: 953 distinct values [High cardinality]
12 is highly correlated with 4 [High correlation]
4 is highly correlated with 12 [High correlation]
5 has 432330 (42.9%) missing values [Missing]
6 has 412392 (41.0%) missing values [Missing]
7 has 884361 (87.8%) missing values [Missing]
8 has 809190 (80.4%) missing values [Missing]
9 has 397969 (39.5%) missing values [Missing]
10 has 665817 (66.1%) missing values [Missing]
11 has 12569 (1.2%) missing values [Missing]
12 has 283479 (28.2%) missing values [Missing]
0 is uniformly distributed [Uniform]
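
The headline missing-cell figure can be cross-checked against the per-variable missing counts given in this report. The sketch below is a minimal consistency check, not part of the profiling tool. The dictionary name is illustrative, and variable 13 is left out on the assumption that it has no missing values, since it raises no Missing warning.

```python
# Missing-value counts per variable, copied from the figures in this report.
# Variable 13 is omitted (assumed to have no missing values).
missing_by_variable = {
    0: 0, 1: 695, 2: 288, 3: 0, 4: 0,
    5: 432330, 6: 412392, 7: 884361, 8: 809190,
    9: 397969, 10: 665817, 11: 12569, 12: 283479,
}
n_variables = 14
n_observations = 1006969

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")
```

If the per-variable counts add up to the reported total, the omitted variable contributes no missing cells.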

Reproduction

Analysis started: 2020-10-20 16:12:18.573393
Analysis finished: 2020-10-20 16:13:29.627599
Duration: 1 minute and 11.05 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
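
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2020-10-20 16:12:18.573393")
finished = datetime.fromisoformat("2020-10-20 16:13:29.627599")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```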

Variables

0
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 689577
Distinct (%): 68.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 MiB
6EA085C0-77B
 
8
2ACBAF2D-B98
 
7
5C630692-5A0
 
7
FACAEBD8-D25
 
7
99B5AB0A-552
 
7
Other values (689572)
1006933 
ValueCountFrequency (%) 
6EA085C0-77B8< 0.1%
 
2ACBAF2D-B987< 0.1%
 
5C630692-5A07< 0.1%
 
FACAEBD8-D257< 0.1%
 
99B5AB0A-5527< 0.1%
 
5D80D33A-D387< 0.1%
 
C16ABB0F-9D87< 0.1%
 
F16A60BF-7667< 0.1%
 
C9F216D1-E7F7< 0.1%
 
D3C536BB-3347< 0.1%
 
0C9EC72C-E0F7< 0.1%
 
92F48A9A-EDF7< 0.1%
 
C6CFF284-EA47< 0.1%
 
581F89A8-3F57< 0.1%
 
EC0D0155-8307< 0.1%
 
3BE8CDFE-C8D7< 0.1%
 
EBC7E41C-A657< 0.1%
 
D9E4543F-B0F7< 0.1%
 
8A48BCF6-3967< 0.1%
 
5A7E685F-85A7< 0.1%
 
D824AB28-F677< 0.1%
 
90D63221-F017< 0.1%
 
DE81176F-2E47< 0.1%
 
7E22B768-CE67< 0.1%
 
A55120A0-7937< 0.1%
 
Other values (689552)1006793> 99.9%
 
Frequencies of value counts

Unique

Unique: 458912
Unique (%): 45.6%
Histogram of lengths of the category

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Overview of Unicode Properties

Unique unicode characters: 17
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
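
As an illustration of these properties, the general category of each character can be looked up with Python's standard `unicodedata` module. This is only a sketch of the idea, not how the report itself was computed. The standard library does not expose scripts or blocks, so only categories are shown, and the sample is one of the values listed for variable 0.

```python
import unicodedata
from collections import Counter

sample = "6EA085C0-77B"  # one of the values listed for variable 0

# unicodedata.category returns a two-letter general category:
# "Nd" = Decimal Number, "Lu" = Uppercase Letter, "Pd" = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(categories)
```

The three categories found in this single value are the same three reported for the whole variable.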

Most occurring characters

ValueCountFrequency (%) 
-10069698.3%
 
36938775.7%
 
56936265.7%
 
F6934925.7%
 
66934505.7%
 
D6932025.7%
 
86931635.7%
 
A6931305.7%
 
06928075.7%
 
16925385.7%
 
E6920195.7%
 
26918935.7%
 
C6918485.7%
 
76914555.7%
 
B6907855.7%
 
46897645.7%
 
96896105.7%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number692218357.3%
 
Uppercase Letter415447634.4%
 
Dash Punctuation10069698.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
369387710.0%
 
569362610.0%
 
669345010.0%
 
869316310.0%
 
069280710.0%
 
169253810.0%
 
269189310.0%
 
769145510.0%
 
468976410.0%
 
968961010.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F69349216.7%
 
D69320216.7%
 
A69313016.7%
 
E69201916.7%
 
C69184816.7%
 
B69078516.6%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1006969100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common792915265.6%
 
Latin415447634.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
-100696912.7%
 
36938778.8%
 
56936268.7%
 
66934508.7%
 
86931638.7%
 
06928078.7%
 
16925388.7%
 
26918938.7%
 
76914558.7%
 
46897648.7%
 
96896108.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
F69349216.7%
 
D69320216.7%
 
A69313016.7%
 
E69201916.7%
 
C69184816.7%
 
B69078516.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII12083628100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
-10069698.3%
 
36938775.7%
 
56936265.7%
 
F6934925.7%
 
66934505.7%
 
D6932025.7%
 
86931635.7%
 
A6931305.7%
 
06928075.7%
 
16925385.7%
 
E6920195.7%
 
26918935.7%
 
C6918485.7%
 
76914555.7%
 
B6907855.7%
 
46897645.7%
 
96896105.7%
 

1
Categorical

HIGH CARDINALITY

Distinct: 96
Distinct (%): < 0.1%
Missing: 695
Missing (%): 0.1%
Memory size: 7.7 MiB
HYUNDAI
106758 
NISSAN
83746 
KIA MOTORS
76742 
CHEVROLET
76139 
TOYOTA
76086 
Other values (91)
586803 
Value               Count    Frequency (%)
HYUNDAI             106758   10.6%
NISSAN              83746    8.3%
KIA MOTORS          76742    7.6%
CHEVROLET           76139    7.6%
TOYOTA              76086    7.6%
SUZUKI              68204    6.8%
FORD                57078    5.7%
JEEP                48456    4.8%
SUBARU              47529    4.7%
MAZDA               36779    3.7%
SSANGYONG           31932    3.2%
MITSUBISHI          24104    2.4%
RENAULT             22002    2.2%
DODGE               21655    2.2%
PEUGEOT             21535    2.1%
GREAT WALL          21226    2.1%
CHERY               20738    2.1%
HONDA               19913    2.0%
MAHINDRA            19047    1.9%
KIA                 14221    1.4%
JAC                 12049    1.2%
BMW                 11523    1.1%
MERCEDES BENZ       8126     0.8%
VOLVO               7635     0.8%
DAIHATSU            7633     0.8%
Other values (71)   65418    6.5%
 
Frequencies of value counts

Unique

Unique: 8
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 17
Median length: 6
Mean length: 6.657539606
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 28
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A70649010.5%
 
O5795628.6%
 
S5029357.5%
 
I4730897.1%
 
N4450676.6%
 
E4450156.6%
 
U4206276.3%
 
T4136626.2%
 
R3739035.6%
 
D3186294.8%
 
H2862774.3%
 
Y2459663.7%
 
M1833192.7%
 
L1768442.6%
 
K1674692.5%
 
G1441572.2%
 
C1344732.0%
 
Z1207601.8%
 
1138631.7%
 
V1069021.6%
 
B972161.5%
 
P738961.1%
 
F682491.0%
 
J606910.9%
 
W407640.6%
 
Other values (3)41110.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter658798898.3%
 
Space Separator1138631.7%
 
Lowercase Letter2085< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A70649010.7%
 
O5795628.8%
 
S5029357.6%
 
I4730897.2%
 
N4450676.8%
 
E4450156.8%
 
U4206276.4%
 
T4136626.3%
 
R3739035.7%
 
D3186294.8%
 
H2862774.3%
 
Y2459663.7%
 
M1833192.8%
 
L1768442.7%
 
K1674692.5%
 
G1441572.2%
 
C1344732.0%
 
Z1207601.8%
 
V1069021.6%
 
B972161.5%
 
P738961.1%
 
F682491.0%
 
J606910.9%
 
W407640.6%
 
X2026< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
113863100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n139066.7%
 
a69533.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin659007398.3%
 
Common1138631.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A70649010.7%
 
O5795628.8%
 
S5029357.6%
 
I4730897.2%
 
N4450676.8%
 
E4450156.8%
 
U4206276.4%
 
T4136626.3%
 
R3739035.7%
 
D3186294.8%
 
H2862774.3%
 
Y2459663.7%
 
M1833192.8%
 
L1768442.7%
 
K1674692.5%
 
G1441572.2%
 
C1344732.0%
 
Z1207601.8%
 
V1069021.6%
 
B972161.5%
 
P738961.1%
 
F682491.0%
 
J606910.9%
 
W407640.6%
 
X2026< 0.1%
 
Other values (2)2085< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
113863100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII6703936100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A70649010.5%
 
O5795628.6%
 
S5029357.5%
 
I4730897.1%
 
N4450676.6%
 
E4450156.6%
 
U4206276.3%
 
T4136626.2%
 
R3739035.6%
 
D3186294.8%
 
H2862774.3%
 
Y2459663.7%
 
M1833192.7%
 
L1768442.6%
 
K1674692.5%
 
G1441572.2%
 
C1344732.0%
 
Z1207601.8%
 
1138631.7%
 
V1069021.6%
 
B972161.5%
 
P738961.1%
 
F682491.0%
 
J606910.9%
 
W407640.6%
 
Other values (3)41110.1%
 

2
Categorical

HIGH CARDINALITY

Distinct: 10762
Distinct (%): 1.1%
Missing: 288
Missing (%): < 0.1%
Memory size: 7.7 MiB
RAV 4
 
18619
QASHQAI
 
17309
NEW TUCSON GL 2.0
 
13662
TUCSON
 
12610
SPORTAGE
 
11480
Other values (10757)
933001 
ValueCountFrequency (%) 
RAV 4186191.8%
 
QASHQAI173091.7%
 
NEW TUCSON GL 2.0136621.4%
 
TUCSON126101.3%
 
SPORTAGE114801.1%
 
SANTA FE107411.1%
 
SPORTAGE LX 2.0106401.1%
 
QASHQAI 1.6102601.0%
 
RAV 4 2.088280.9%
 
GRAND NOMADE87690.9%
 
SANTA FE GLS 2.485030.8%
 
TIGGO 1.673180.7%
 
ECOSPORT73110.7%
 
OPTRA LS 1.670730.7%
 
RAV4 2.470030.7%
 
CX 569440.7%
 
HAVAL H3 LE 2.062940.6%
 
X TRAIL61280.6%
 
CAPTIVA60260.6%
 
RAV4 2.4 AUT57810.6%
 
EXPLORER56400.6%
 
ORLANDO LS 2.055440.6%
 
TUCSON GL 2.054470.5%
 
SORENTO53200.5%
 
SCORPIO GLX 2.252390.5%
 
Other values (10737)78819278.3%
 
Frequencies of value counts

Unique

Unique: 2311
Unique (%): 0.2%
Histogram of lengths of the category

Length

Max length: 35
Median length: 14
Mean length: 13.99234733
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 42
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
207455214.7%
 
A11184747.9%
 
T8541606.1%
 
R7813165.5%
 
E7776225.5%
 
O6404344.5%
 
.6302044.5%
 
S6162994.4%
 
L5611924.0%
 
N5496193.9%
 
24943443.5%
 
44466543.2%
 
I4247803.0%
 
X4240313.0%
 
U4178603.0%
 
D3650372.6%
 
C3485862.5%
 
G3265762.3%
 
02911582.1%
 
P2679031.9%
 
V2504911.8%
 
51681651.2%
 
W1446161.0%
 
11420781.0%
 
H1403211.0%
 
Other values (17)8333885.9%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter953201567.7%
 
Space Separator207455214.7%
 
Decimal Number185191513.1%
 
Other Punctuation6303224.5%
 
Lowercase Letter864< 0.1%
 
Dash Punctuation192< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A111847411.7%
 
T8541609.0%
 
R7813168.2%
 
E7776228.2%
 
O6404346.7%
 
S6162996.5%
 
L5611925.9%
 
N5496195.8%
 
I4247804.5%
 
X4240314.4%
 
U4178604.4%
 
D3650373.8%
 
C3485863.7%
 
G3265763.4%
 
P2679032.8%
 
V2504912.6%
 
W1446161.5%
 
H1403211.5%
 
M1216821.3%
 
F1078271.1%
 
Q1039221.1%
 
K769160.8%
 
Y465800.5%
 
B362950.4%
 
J235140.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
2074552100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
249434426.7%
 
444665424.1%
 
029115815.7%
 
51681659.1%
 
11420787.7%
 
31239226.7%
 
61164326.3%
 
8345031.9%
 
7274501.5%
 
972090.4%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.630204> 99.9%
 
,118< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n57666.7%
 
a28833.3%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-192100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin953287967.7%
 
Common455698132.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A111847411.7%
 
T8541609.0%
 
R7813168.2%
 
E7776228.2%
 
O6404346.7%
 
S6162996.5%
 
L5611925.9%
 
N5496195.8%
 
I4247804.5%
 
X4240314.4%
 
U4178604.4%
 
D3650373.8%
 
C3485863.7%
 
G3265763.4%
 
P2679032.8%
 
V2504912.6%
 
W1446161.5%
 
H1403211.5%
 
M1216821.3%
 
F1078271.1%
 
Q1039221.1%
 
K769160.8%
 
Y465800.5%
 
B362950.4%
 
J235140.2%
 
Other values (3)68260.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
207455245.5%
 
.63020413.8%
 
249434410.8%
 
44466549.8%
 
02911586.4%
 
51681653.7%
 
11420783.1%
 
31239222.7%
 
61164322.6%
 
8345030.8%
 
7274500.6%
 
972090.2%
 
-192< 0.1%
 
,118< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII14089860100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
207455214.7%
 
A11184747.9%
 
T8541606.1%
 
R7813165.5%
 
E7776225.5%
 
O6404344.5%
 
.6302044.5%
 
S6162994.4%
 
L5611924.0%
 
N5496193.9%
 
24943443.5%
 
44466543.2%
 
I4247803.0%
 
X4240313.0%
 
U4178603.0%
 
D3650372.6%
 
C3485862.5%
 
G3265762.3%
 
02911582.1%
 
P2679031.9%
 
V2504911.8%
 
51681651.2%
 
W1446161.0%
 
11420781.0%
 
H1403211.0%
 
Other values (17)8333885.9%
 

3
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.540322
Minimum: 2010
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.7 MiB

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2012
Median: 2013
Q3: 2015
95-th percentile: 2017
Maximum: 2018
Range: 8
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.352165235
Coefficient of variation (CV): 0.001168173892
Kurtosis: -1.063867656
Mean: 2013.540322
Median Absolute Deviation (MAD): 2
Skewness: 0.1871796021
Sum: 2027572685
Variance: 5.532681292
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value   Count    Frequency (%)
2013    144099   14.3%
2011    136796   13.6%
2012    133456   13.3%
2014    133435   13.3%
2017    110632   11.0%
2010    108250   10.8%
2015    103970   10.3%
2016    94836    9.4%
2018    41495    4.1%

Value   Count    Frequency (%)
2010    108250   10.8%
2011    136796   13.6%
2012    133456   13.3%
2013    144099   14.3%
2014    133435   13.3%
2015    103970   10.3%
2016    94836    9.4%
2017    110632   11.0%
2018    41495    4.1%

Value   Count    Frequency (%)
2018    41495    4.1%
2017    110632   11.0%
2016    94836    9.4%
2015    103970   10.3%
2014    133435   13.3%
2013    144099   14.3%
2012    133456   13.3%
2011    136796   13.6%
2010    108250   10.8%
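
Because variable 3 takes only nine distinct values, its reported Sum and Mean can be recomputed from the value counts alone. The following is a minimal sketch of that check; the counts are copied from the tables for this variable and the names are illustrative.

```python
# Value counts for variable 3, as listed in this report.
year_counts = {
    2010: 108250, 2011: 136796, 2012: 133456,
    2013: 144099, 2014: 133435, 2015: 103970,
    2016: 94836, 2017: 110632, 2018: 41495,
}

n = sum(year_counts.values())
# Weighted sum: each value multiplied by how often it occurs.
total = sum(year * count for year, count in year_counts.items())
mean = total / n

print(n, total, mean)
```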
 

4
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 754428
Distinct (%): 74.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56500042.79
Minimum: 0
Maximum: 329820688
Zeros: 1526
Zeros (%): 0.2%
Memory size: 7.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 18052498.6
Q1: 29023307
Median: 40627585
Q3: 50742038
95-th percentile: 253184865.8
Maximum: 329820688
Range: 329820688
Interquartile range (IQR): 21718731

Descriptive statistics

Standard deviation: 64398525.67
Coefficient of variation (CV): 1.139796051
Kurtosis: 8.169974409
Mean: 56500042.79
Median Absolute Deviation (MAD): 10764804
Skewness: 3.055284087
Sum: 5.689379158e+13
Variance: 4.147170108e+15
Monotonicity: Not monotonic
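
Several of these statistics are derived from others, so they can be checked against each other. The sketch below uses the standard definitions (CV = standard deviation / mean, IQR = Q3 - Q1, variance = standard deviation squared) with the rounded figures printed for variable 4, so agreement is only expected to within rounding.

```python
# Figures copied from the statistics reported for variable 4.
mean = 56500042.79
std = 64398525.67
q1, q3 = 29023307, 50742038

cv = std / mean        # coefficient of variation
iqr = q3 - q1          # interquartile range
variance = std ** 2    # variance is the squared standard deviation

print(cv, iqr, variance)
```

A CV above 1 together with a skewness of about 3 indicates a long right tail, which is consistent with the 95th percentile lying far above Q3.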
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
32122493129160.3%
 
26375499527670.3%
 
32133089816320.2%
 
32123155415410.2%
 
015260.2%
 
25738556513540.1%
 
26348438211070.1%
 
3214766029000.1%
 
3213176528700.1%
 
2003632398420.1%
 
3195900637950.1%
 
3117442196680.1%
 
3201480116220.1%
 
2808357076000.1%
 
2644316255790.1%
 
2748369965740.1%
 
3209750155690.1%
 
2517827595560.1%
 
3212878495300.1%
 
320941139500< 0.1%
 
320552406459< 0.1%
 
303005274451< 0.1%
 
307664499440< 0.1%
 
321248111388< 0.1%
 
282056974387< 0.1%
 
Other values (754403)98339697.7%
 
ValueCountFrequency (%) 
015260.2%
 
7781< 0.1%
 
566722< 0.1%
 
747891< 0.1%
 
1187652< 0.1%
 
1500551< 0.1%
 
3003851< 0.1%
 
3438882< 0.1%
 
5473911< 0.1%
 
7379451< 0.1%
 
ValueCountFrequency (%) 
3298206881< 0.1%
 
3298193641< 0.1%
 
32981883423< 0.1%
 
32981866823< 0.1%
 
3298176081< 0.1%
 
3298171456< 0.1%
 
3298171121< 0.1%
 
3298170791< 0.1%
 
3298164493< 0.1%
 
3298159201< 0.1%
 

5
Categorical

HIGH CARDINALITY
MISSING

Distinct: 429
Distinct (%): 0.1%
Missing: 432330
Missing (%): 42.9%
Memory size: 7.7 MiB
LAS CONDES
45799 
 
33432
SANTIAGO
 
22283
PROVIDENCIA
 
20560
MAIPU
 
18269
Other values (424)
434296 
ValueCountFrequency (%) 
LAS CONDES457994.5%
 
334323.3%
 
SANTIAGO222832.2%
 
PROVIDENCIA205602.0%
 
MAIPU182691.8%
 
ANTOFAGASTA176261.8%
 
VITACURA167521.7%
 
NUNOA166861.7%
 
LA FLORIDA163441.6%
 
VINA DEL MAR142351.4%
 
PUENTE ALTO137271.4%
 
LO BARNECHEA132141.3%
 
CONCEPCION118701.2%
 
CALAMA95961.0%
 
LA REINA93320.9%
 
PENALOLEN92560.9%
 
TEMUCO89430.9%
 
RANCAGUA88570.9%
 
LA SERENA79910.8%
 
PUDAHUEL70150.7%
 
VALPARAISO66990.7%
 
SAN BERNARDO64830.6%
 
QUILICURA64830.6%
 
COPIAPO62420.6%
 
SAN MIGUEL57780.6%
 
Other values (404)22116722.0%
 
(Missing)43233042.9%
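
The second most frequent entry in this table has an empty label (33432 rows), which suggests blank or whitespace-only strings that are not counted as missing. This reading is an inference from the extract rather than something the report states. Under that assumption, a minimal sketch of treating such values as missing looks like this; the helper name is hypothetical.

```python
def is_effectively_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or value.strip() == ""

# Figures copied from the report for variable 5.
reported_missing = 432330
blank_count = 33432      # rows with an empty label (assumed to be blank strings)
n_observations = 1006969

effective_missing = reported_missing + blank_count
effective_pct = 100 * effective_missing / n_observations
print(effective_missing, f"{effective_pct:.1f}%")
```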
 
Frequencies of value counts

Unique

Unique: 39
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 20
Median length: 5
Mean length: 6.168077667
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n86466013.9%
 
A80393612.9%
 
a4323307.0%
 
N4269836.9%
 
O3889716.3%
 
E3812516.1%
 
L3604305.8%
 
I3100515.0%
 
C2930134.7%
 
2792324.5%
 
R2468594.0%
 
S2323143.7%
 
U2234873.6%
 
T1944193.1%
 
P1588622.6%
 
D1499992.4%
 
M973931.6%
 
V815481.3%
 
G781431.3%
 
H605691.0%
 
Q443460.7%
 
B411270.7%
 
F406870.7%
 
J90700.1%
 
Z60060.1%
 
Other values (4)53770.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter463464074.6%
 
Lowercase Letter129699020.9%
 
Space Separator2792324.5%
 
Dash Punctuation199< 0.1%
 
Other Punctuation2< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A80393617.3%
 
N4269839.2%
 
O3889718.4%
 
E3812518.2%
 
L3604307.8%
 
I3100516.7%
 
C2930136.3%
 
R2468595.3%
 
S2323145.0%
 
U2234874.8%
 
T1944194.2%
 
P1588623.4%
 
D1499993.2%
 
M973932.1%
 
V815481.8%
 
G781431.7%
 
H605691.3%
 
Q443461.0%
 
B411270.9%
 
F406870.9%
 
J90700.2%
 
Z60060.1%
 
Y51700.1%
 
K6< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
279232100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n86466066.7%
 
a43233033.3%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-199100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
"2100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin593163095.5%
 
Common2794334.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n86466014.6%
 
A80393613.6%
 
a4323307.3%
 
N4269837.2%
 
O3889716.6%
 
E3812516.4%
 
L3604306.1%
 
I3100515.2%
 
C2930134.9%
 
R2468594.2%
 
S2323143.9%
 
U2234873.8%
 
T1944193.3%
 
P1588622.7%
 
D1499992.5%
 
M973931.6%
 
V815481.4%
 
G781431.3%
 
H605691.0%
 
Q443460.7%
 
B411270.7%
 
F406870.7%
 
J90700.2%
 
Z60060.1%
 
Y51700.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
27923299.9%
 
-1990.1%
 
"2< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII6211063100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n86466013.9%
 
A80393612.9%
 
a4323307.0%
 
N4269836.9%
 
O3889716.3%
 
E3812516.1%
 
L3604305.8%
 
I3100515.0%
 
C2930134.7%
 
2792324.5%
 
R2468594.0%
 
S2323143.7%
 
U2234873.6%
 
T1944193.1%
 
P1588622.6%
 
D1499992.4%
 
M973931.6%
 
V815481.3%
 
G781431.3%
 
H605691.0%
 
Q443460.7%
 
B411270.7%
 
F406870.7%
 
J90700.1%
 
Z60060.1%
 
Other values (4)53770.1%
 

6
Categorical

MISSING

Distinct: 41
Distinct (%): < 0.1%
Missing: 412392
Missing (%): 41.0%
Memory size: 7.7 MiB
13
152997 
METROPOLITANA DE SANTIAGO
149221 
33432 
DE VALPARAISO
22367 
DEL BIO BIO
 
21470
Other values (36)
215090 
ValueCountFrequency (%) 
1315299715.2%
 
METROPOLITANA DE SANTIAGO14922114.8%
 
334323.3%
 
DE VALPARAISO223672.2%
 
DEL BIO BIO214702.1%
 
0199512.0%
 
DE ANTOFAGASTA151401.5%
 
5137911.4%
 
05124211.2%
 
8121121.2%
 
08120071.2%
 
DEL LIBERTADOR BERNARDO OHIGGINS89190.9%
 
286900.9%
 
DE COQUIMBO85750.9%
 
1083230.8%
 
DE LA ARAUCANIA70830.7%
 
DEL MAULE67970.7%
 
DE LOS LAGOS65320.6%
 
0254240.5%
 
DE ATACAMA51130.5%
 
451040.5%
 
0650850.5%
 
650050.5%
 
0948670.5%
 
744950.4%
 
Other values (16)396563.9%
 
(Missing)41239241.0%
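
The table above mixes numeric codes with and without a leading zero (for example "5" and "05", "8" and "08") alongside spelled-out names. A minimal sketch of collapsing the zero-padded variants follows. The helper name is hypothetical, and that "05" and "5" denote the same category is an assumption. Names are left untouched because mapping them to codes would need a lookup table that this report does not provide.

```python
def normalize_region(value):
    """Collapse zero-padded numeric codes ("05" -> "5"); leave other text as is."""
    if value is None:
        return None
    text = value.strip()
    if text.isdigit():
        # int() drops leading zeros, so "05" and "5" become the same key.
        return str(int(text))
    # Blank strings become None; names pass through unchanged.
    return text or None

print(normalize_region("05"), normalize_region("DE VALPARAISO"))
```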
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 33
Median length: 3
Mean length: 7.089148723
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 36
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A85053311.9%
 
n82478411.6%
 
O5930638.3%
 
5129527.2%
 
T5012767.0%
 
E4441716.2%
 
I4175235.8%
 
a4123925.8%
 
N3500294.9%
 
D2795573.9%
 
L2580863.6%
 
R2267773.2%
 
S2190633.1%
 
G1922052.7%
 
P1755892.5%
 
11739172.4%
 
M1723562.4%
 
31578792.2%
 
0799261.1%
 
B701771.0%
 
C320000.4%
 
5274930.4%
 
8241190.3%
 
U224550.3%
 
V223670.3%
 
Other values (11)978641.4%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter486680968.2%
 
Lowercase Letter123828017.3%
 
Decimal Number5205127.3%
 
Space Separator5129527.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
117391733.4%
 
315787930.3%
 
07992615.4%
 
5274935.3%
 
8241194.6%
 
2173323.3%
 
4119852.3%
 
6100901.9%
 
990241.7%
 
787471.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A85053317.5%
 
O59306312.2%
 
T50127610.3%
 
E4441719.1%
 
I4175238.6%
 
N3500297.2%
 
D2795575.7%
 
L2580865.3%
 
R2267774.7%
 
S2190634.5%
 
G1922053.9%
 
P1755893.6%
 
M1723563.5%
 
B701771.4%
 
C320000.7%
 
U224550.5%
 
V223670.5%
 
F151400.3%
 
H115690.2%
 
Q85750.2%
 
Y34740.1%
 
Z824< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
512952100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n82478466.6%
 
a41239233.3%
 
y11040.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin610508985.5%
 
Common103346414.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
51295249.6%
 
117391716.8%
 
315787915.3%
 
0799267.7%
 
5274932.7%
 
8241192.3%
 
2173321.7%
 
4119851.2%
 
6100901.0%
 
990240.9%
 
787470.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A85053313.9%
 
n82478413.5%
 
O5930639.7%
 
T5012768.2%
 
E4441717.3%
 
I4175236.8%
 
a4123926.8%
 
N3500295.7%
 
D2795574.6%
 
L2580864.2%
 
R2267773.7%
 
S2190633.6%
 
G1922053.1%
 
P1755892.9%
 
M1723562.8%
 
B701771.1%
 
C320000.5%
 
U224550.4%
 
V223670.4%
 
F151400.2%
 
H115690.2%
 
Q85750.1%
 
Y34740.1%
 
y1104< 0.1%
 
Z824< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII7138553100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A85053311.9%
 
n82478411.6%
 
O5930638.3%
 
5129527.2%
 
T5012767.0%
 
E4441716.2%
 
I4175235.8%
 
a4123925.8%
 
N3500294.9%
 
D2795573.9%
 
L2580863.6%
 
R2267773.2%
 
S2190633.1%
 
G1922052.7%
 
P1755892.5%
 
11739172.4%
 
M1723562.4%
 
31578792.2%
 
0799261.1%
 
B701771.0%
 
C320000.4%
 
5274930.4%
 
8241190.3%
 
U224550.3%
 
V223670.3%
 
Other values (11)978641.4%
 

7
Categorical

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 884361
Missing (%): 87.8%
Memory size: 7.7 MiB
M
67439 
F
55169 
Value       Count    Frequency (%)
M           67439    6.7%
F           55169    5.5%
(Missing)   884361   87.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.756481083
Min length: 1
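
A median length of 3 for a variable whose only values are "M" and "F" looks odd at first. The figures are consistent with missing values being counted as the three-character string "nan" in the length and character statistics. This is an inference from the numbers, not something the report states. A minimal sketch of the arithmetic:

```python
# Counts copied from the report for variable 7.
counts = {"M": 67439, "F": 55169}
missing = 884361

# A float NaN converted to text becomes the three-character string "nan".
nan_text = str(float("nan"))

total_chars = sum(len(v) * c for v, c in counts.items()) + len(nan_text) * missing
n = sum(counts.values()) + missing
mean_length = total_chars / n

print(nan_text, total_chars, mean_length)
```

The same effect would explain the lowercase "n" and "a" counts in the character tables of the other variables with missing values.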

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n176872263.7%
 
a88436131.9%
 
M674392.4%
 
F551692.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter265308395.6%
 
Uppercase Letter1226084.4%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M6743955.0%
 
F5516945.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n176872266.7%
 
a88436133.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2775691100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n176872263.7%
 
a88436131.9%
 
M674392.4%
 
F551692.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2775691100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n176872263.7%
 
a88436131.9%
 
M674392.4%
 
F551692.0%
 

8
Boolean

MISSING

Distinct: 1
Distinct (%): < 0.1%
Missing: 809190
Missing (%): 80.4%
Memory size: 7.7 MiB
0
197779 
(Missing)
809190 
Value       Count    Frequency (%)
0           197779   19.6%
(Missing)   809190   80.4%

9
Categorical

HIGH CARDINALITY
MISSING

Distinct: 5946
Distinct (%): 1.0%
Missing: 397969
Missing (%): 39.5%
Memory size: 7.7 MiB
0
270604 
10601111,0
 
3796
11947500,0
 
3289
7390000
 
2540
13473846,0
 
2344
Other values (5941)
326427 
ValueCountFrequency (%) 
027060426.9%
 
10601111,037960.4%
 
11947500,032890.3%
 
739000025400.3%
 
13473846,023440.2%
 
573000023240.2%
 
10665000,020760.2%
 
10070000,020570.2%
 
12154444,020120.2%
 
1292000018100.2%
 
531000017380.2%
 
1044000017340.2%
 
16897308,015630.2%
 
803000015110.2%
 
976000014130.1%
 
910000013890.1%
 
1049000013850.1%
 
944000013790.1%
 
794000012790.1%
 
7601667,012330.1%
 
647000012050.1%
 
12391667,011830.1%
 
791000011610.1%
 
10230000,011580.1%
 
1117600011110.1%
 
Other values (5921)29570629.4%
 
(Missing)39796939.5%
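
Variable 9 is typed as categorical, but its values look numeric, with some written using a comma (for example "10601111,0"). A minimal sketch of converting such strings to numbers follows. It assumes the comma is a decimal separator, which the report does not confirm, and the function name is hypothetical.

```python
def parse_amount(raw):
    """Parse strings such as "7390000" or "10601111,0" into floats.

    Assumes "," is a decimal separator; returns None for missing values.
    """
    if raw is None:
        return None
    # Drop surrounding and embedded spaces before parsing.
    text = raw.strip().replace(" ", "")
    if text == "" or text.lower() == "nan":
        return None
    return float(text.replace(",", "."))

print(parse_amount("10601111,0"), parse_amount("7390000"))
```

After such a conversion the variable could be profiled as numeric, which would also merge values that differ only in formatting.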
 
Frequencies of value counts

Unique

Unique: 820
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 11
Median length: 3
Mean length: 4.24823505
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
0141798433.1%
 
n79593818.6%
 
a3979699.3%
 
12719076.4%
 
62002004.7%
 
51711674.0%
 
71579273.7%
 
31511353.5%
 
91344533.1%
 
81316153.1%
 
41315893.1%
 
21211132.8%
 
974222.3%
 
,974222.3%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number288909067.5%
 
Lowercase Letter119390727.9%
 
Space Separator974222.3%
 
Other Punctuation974222.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
97422100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0141798449.1%
 
12719079.4%
 
62002006.9%
 
51711675.9%
 
71579275.5%
 
31511355.2%
 
91344534.7%
 
81316154.6%
 
41315894.6%
 
21211134.2%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,97422100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n79593866.7%
 
a39796933.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common308393472.1%
 
Latin119390727.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
0141798446.0%
 
12719078.8%
 
62002006.5%
 
51711675.6%
 
71579275.1%
 
31511354.9%
 
91344534.4%
 
81316154.3%
 
41315894.3%
 
21211133.9%
 
974223.2%
 
,974223.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n79593866.7%
 
a39796933.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4277841100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0141798433.1%
 
n79593818.6%
 
a3979699.3%
 
12719076.4%
 
62002004.7%
 
51711674.0%
 
71579273.7%
 
31511353.5%
 
91344533.1%
 
81316153.1%
 
41315893.1%
 
21211132.8%
 
974222.3%
 
,974222.3%
 

10
Real number (ℝ≥0)

MISSING

Distinct: 2363
Distinct (%): 0.7%
Missing: 665817
Missing (%): 66.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 20151803.6
Minimum: 20091015
Maximum: 20180111
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.7 MiB

Quantile statistics

Minimum: 20091015
5-th percentile: 20120613
Q1: 20140704
Median: 20151202
Q3: 20170116
95-th percentile: 20171107
Maximum: 20180111
Range: 89096
Interquartile range (IQR): 29412

Descriptive statistics

Standard deviation: 16814.17847
Coefficient of variation (CV): 0.0008343758607
Kurtosis: -0.3480813467
Mean: 20151803.6
Median Absolute Deviation (MAD): 10596
Skewness: -0.6546990626
Sum: 6.874828101e+12
Variance: 282716597.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
201511097100.1%
 
201511106850.1%
 
201511116820.1%
 
201511166670.1%
 
201511136120.1%
 
201511125920.1%
 
201511175440.1%
 
201711075040.1%
 
20161227501< 0.1%
 
20171206500< 0.1%
 
20151202499< 0.1%
 
20151119491< 0.1%
 
20171213485< 0.1%
 
20171221481< 0.1%
 
20151123475< 0.1%
 
20161229475< 0.1%
 
20171205471< 0.1%
 
20151118469< 0.1%
 
20171026456< 0.1%
 
20171025452< 0.1%
 
20171220450< 0.1%
 
20161230448< 0.1%
 
20171219445< 0.1%
 
20171031445< 0.1%
 
20151120442< 0.1%
 
Other values (2338)32817132.6%
 
(Missing)66581766.1%
 
ValueCountFrequency (%) 
200910151< 0.1%
 
200910281< 0.1%
 
200911021< 0.1%
 
200911191< 0.1%
 
200911202< 0.1%
 
200911262< 0.1%
 
200911271< 0.1%
 
200911302< 0.1%
 
200912101< 0.1%
 
200912111< 0.1%
 
ValueCountFrequency (%) 
20180111224< 0.1%
 
20180110249< 0.1%
 
20180109296< 0.1%
 
20180108318< 0.1%
 
2018010630< 0.1%
 
20180105248< 0.1%
 
20180104215< 0.1%
 
20180103210< 0.1%
 
20180102166< 0.1%
 
20171229238< 0.1%
 

11
Categorical

HIGH CARDINALITY
MISSING

Distinct: 953
Distinct (%): 0.1%
Missing: 12569
Missing (%): 1.2%
Memory size: 7.7 MiB
GRIS
231524 
BLANCO
213744 
PLATEADO
205961 
NEGRO
123425 
ROJO
63273 
Other values (948)
156473 
ValueCountFrequency (%) 
GRIS23152423.0%
 
BLANCO21374421.2%
 
PLATEADO20596120.5%
 
NEGRO12342512.3%
 
ROJO632736.3%
 
AZUL446024.4%
 
BEIGE172731.7%
 
CAFE110081.1%
 
DORADO109401.1%
 
VERDE75670.8%
 
NARANJO48870.5%
 
CELESTE40390.4%
 
GRIS GRAFITO28760.3%
 
GRIS OSCURO METAL24750.2%
 
PLATEADO PLATA23120.2%
 
GRIS OSCURO22140.2%
 
GRIS PLATA21710.2%
 
BLANCO PERLA21030.2%
 
PLATEADO METALICO19590.2%
 
NEGRO METALICO19130.2%
 
BLANCO PERLADO17030.2%
 
PLATEADO PLATA ME16650.2%
 
PLATEADO PLATA BR14060.1%
 
GRIS METALICO9800.1%
 
GRIS ACERO9770.1%
 
Other values (928)314033.1%
 
(Missing)125691.2%
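
Much of the high cardinality in variable 11 comes from qualified variants of a few base values ("GRIS", "GRIS OSCURO", "GRIS PLATA", and so on). One possible way to reduce it is to keep only the first word. This is a rough sketch under that assumption, not a recommendation from the report, and the function name is hypothetical.

```python
def base_colour(value):
    """Return the first whitespace-separated token, or None if there is none."""
    if value is None:
        return None
    tokens = value.split()
    return tokens[0] if tokens else None

for raw in ["GRIS OSCURO METAL", "PLATEADO PLATA ME", "ROJO"]:
    print(raw, "->", base_colour(raw))
```

This discards the qualifier, so it only suits analyses where the base value is enough.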
 
Frequencies of value counts

Unique

Unique: 377
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 20
Median length: 5
Mean length: 5.969200641
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 30
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A80488013.4%
 
O77404112.9%
 
L5327378.9%
 
R5008558.3%
 
E4604297.7%
 
G4016846.7%
 
N3746246.2%
 
I2961994.9%
 
S2648094.4%
 
C2613884.3%
 
T2593624.3%
 
D2560884.3%
 
B2460744.1%
 
P2363073.9%
 
J717781.2%
 
710631.2%
 
U565000.9%
 
Z487070.8%
 
n251380.4%
 
M218340.4%
 
F163710.3%
 
a125690.2%
 
V120590.2%
 
H2233< 0.1%
 
Y1235< 0.1%
 
Other values (5)1836< 0.1%
 

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 5902007 | 98.2%
Space Separator | 71063 | 1.2%
Lowercase Letter | 37707 | 0.6%
Other Punctuation | 23 | < 0.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
A | 804880 | 13.6%
O | 774041 | 13.1%
L | 532737 | 9.0%
R | 500855 | 8.5%
E | 460429 | 7.8%
G | 401684 | 6.8%
N | 374624 | 6.3%
I | 296199 | 5.0%
S | 264809 | 4.5%
C | 261388 | 4.4%
T | 259362 | 4.4%
D | 256088 | 4.3%
B | 246074 | 4.2%
P | 236307 | 4.0%
J | 71778 | 1.2%
U | 56500 | 1.0%
Z | 48707 | 0.8%
M | 21834 | 0.4%
F | 16371 | 0.3%
V | 12059 | 0.2%
H | 2233 | < 0.1%
Y | 1235 | < 0.1%
X | 925 | < 0.1%
K | 495 | < 0.1%
W | 382 | < 0.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 25138 | 66.7%
a | 12569 | 33.3%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 71063 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
@ | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5939714 | 98.8%
Common | 71086 | 1.2%

Most frequent Latin characters

Value | Count | Frequency (%)
A | 804880 | 13.6%
O | 774041 | 13.0%
L | 532737 | 9.0%
R | 500855 | 8.4%
E | 460429 | 7.8%
G | 401684 | 6.8%
N | 374624 | 6.3%
I | 296199 | 5.0%
S | 264809 | 4.5%
C | 261388 | 4.4%
T | 259362 | 4.4%
D | 256088 | 4.3%
B | 246074 | 4.1%
P | 236307 | 4.0%
J | 71778 | 1.2%
U | 56500 | 1.0%
Z | 48707 | 0.8%
n | 25138 | 0.4%
M | 21834 | 0.4%
F | 16371 | 0.3%
a | 12569 | 0.2%
V | 12059 | 0.2%
H | 2233 | < 0.1%
Y | 1235 | < 0.1%
X | 925 | < 0.1%
Other values (3) | 888 | < 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 71063 | > 99.9%
@ | 23 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6010800 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
A | 804880 | 13.4%
O | 774041 | 12.9%
L | 532737 | 8.9%
R | 500855 | 8.3%
E | 460429 | 7.7%
G | 401684 | 6.7%
N | 374624 | 6.2%
I | 296199 | 4.9%
S | 264809 | 4.4%
C | 261388 | 4.3%
T | 259362 | 4.3%
D | 256088 | 4.3%
B | 246074 | 4.1%
P | 236307 | 3.9%
J | 71778 | 1.2%
(space) | 71063 | 1.2%
U | 56500 | 0.9%
Z | 48707 | 0.8%
n | 25138 | 0.4%
M | 21834 | 0.4%
F | 16371 | 0.3%
a | 12569 | 0.2%
V | 12059 | 0.2%
H | 2233 | < 0.1%
Y | 1235 | < 0.1%
Other values (5) | 1836 | < 0.1%

12
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 88
Distinct (%): < 0.1%
Missing: 283479
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.93003912
Minimum: 3
Maximum: 134
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.7 MiB

Quantile statistics

Minimum: 3
5-th percentile: 31
Q1: 39
Median: 48
Q3: 58
95-th percentile: 71
Maximum: 134
Range: 131
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 12.58779248
Coefficient of variation (CV): 0.2572610344
Kurtosis: -0.4217535604
Mean: 48.93003912
Median Absolute Deviation (MAD): 9
Skewness: 0.3395678924
Sum: 35400394
Variance: 158.4525195
Monotonicity: Not monotonic
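
Several of the figures above are derived from one another. The following is a minimal sketch, not part of the original report, that recomputes the range, IQR, coefficient of variation, variance and sum from the other reported numbers for variable 12. The variable names are illustrative, and the inputs are the rounded values printed in the report, so the results agree only up to rounding.

```python
# Values copied from the report for variable 12 (rounded as printed there).
minimum, maximum = 3, 134
q1, q3 = 39, 58
mean = 48.93003912
std = 12.58779248
n_rows, n_missing = 1006969, 283479

value_range = maximum - minimum        # report lists Range as 131
iqr = q3 - q1                          # report lists IQR as 19
cv = std / mean                        # report lists CV as 0.2572610344
variance = std ** 2                    # report lists Variance as 158.4525195
total = (n_rows - n_missing) * mean    # report lists Sum as 35400394

print(value_range, iqr, cv, variance, total)
```

Note that the sum is taken over the non-missing observations only, which is why the missing count is subtracted from the number of rows.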
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
44 | 22234 | 2.2%
43 | 22047 | 2.2%
42 | 21883 | 2.2%
39 | 21374 | 2.1%
41 | 21145 | 2.1%
49 | 20553 | 2.0%
48 | 20462 | 2.0%
45 | 20452 | 2.0%
40 | 20350 | 2.0%
38 | 19690 | 2.0%
47 | 19668 | 2.0%
46 | 19338 | 1.9%
50 | 19139 | 1.9%
51 | 18858 | 1.9%
52 | 18543 | 1.8%
37 | 18218 | 1.8%
35 | 17879 | 1.8%
53 | 17847 | 1.8%
54 | 17748 | 1.8%
55 | 17400 | 1.7%
36 | 17172 | 1.7%
56 | 16983 | 1.7%
34 | 16944 | 1.7%
33 | 15263 | 1.5%
57 | 15191 | 1.5%
Other values (63) | 247109 | 24.5%
(Missing) | 283479 | 28.2%

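Smallest values

Value | Count | Frequency (%)
3 | 84 | < 0.1%
16 | 1 | < 0.1%
17 | 9 | < 0.1%
18 | 196 | < 0.1%
19 | 536 | 0.1%
20 | 774 | 0.1%
21 | 829 | 0.1%
22 | 712 | 0.1%
23 | 935 | 0.1%
24 | 522 | 0.1%

Largest values

Value | Count | Frequency (%)
134 | 1 | < 0.1%
103 | 1 | < 0.1%
100 | 2 | < 0.1%
99 | 4 | < 0.1%
98 | 6 | < 0.1%
97 | 8 | < 0.1%
96 | 10 | < 0.1%
95 | 12 | < 0.1%
94 | 21 | < 0.1%
93 | 51 | < 0.1%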

13
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 MiB
S: 688992
N: 317977
Value | Count | Frequency (%)
S | 688992 | 68.4%
N | 317977 | 31.6%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique Unicode characters: 2
Unique Unicode categories: 1
Unique Unicode scripts: 1
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
S | 688992 | 68.4%
N | 317977 | 31.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1006969 | 100.0%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
S | 688992 | 68.4%
N | 317977 | 31.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1006969 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
S | 688992 | 68.4%
N | 317977 | 31.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1006969 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
S | 688992 | 68.4%
N | 317977 | 31.6%

Interactions

[Figures: 16 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
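
As an illustration of the definition above, here is a minimal pure-Python sketch of r. The function name `pearson_r` and the sample data are hypothetical and not taken from the report, which presumably relies on a library routine instead.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors cancel, so plain sums of deviations are used.
    Raises ZeroDivisionError if either input is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

x = [1, 2, 4, 7]
y = [3, 1, 6, 5]
r_original = pearson_r(x, y)
# Shifting and (positively) rescaling x leaves r unchanged.
r_rescaled = pearson_r([3 * a + 2 for a in x], y)
print(r_original, r_rescaled)
```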

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
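
The rank-based calculation can be sketched as follows. This is an illustrative pure-Python version, assuming ties receive the average of the ranks they span; the helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (dev_x * dev_y)

# A monotonic but nonlinear relationship (y = x cubed).
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(rho)
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.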

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
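
A direct translation of this pair-counting definition might look like the sketch below. It implements the simple form often called tau-a, which makes no adjustment for ties; statistical libraries commonly report a tie-adjusted variant, so results can differ on data with repeated values. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; needs at least 2 points."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
        # product == 0 means a tie in x or y; it counts toward neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This version compares every pair and therefore runs in quadratic time, which is fine for illustration but slow for a dataset of roughly a million rows.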

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
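
To make the bias correction concrete, here is a minimal sketch of the corrected estimator computed from a contingency table. It follows the commonly cited form of Bergsma's correction and is an illustration only, not the report generator's actual code; the function name `cramers_v_corrected` and the example tables are hypothetical.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes more than one observation in total.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: subtract the expected bias and shrink r and k.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

perfect = cramers_v_corrected([[10, 0], [0, 10]])      # perfectly associated
independent = cramers_v_corrected([[5, 5], [5, 5]])    # no association
print(perfect, independent)
```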

Missing values

[Figures: 4 missing-value plots]

Sample

First rows

(index) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
0 | 393A4B4C-085 | TOYOTA | RAV 4 | 2015 | 50838335 | TEMUCO | 09 | M | NaN | 11947500,0 | NaN | BLANCO | NaN | N
1 | B1F3DB7E-F67 | MAZDA | CX 5 | 2016 | 46322649 | VILLA ALEMANA | 05 | M | NaN | NaN | NaN | NEGRO | 38.0 | S
2 | F6DBB2E6-A76 | GREAT WALL | HAVAL NEW H3 2.0 | 2014 | 36226810 | ANTOFAGASTA | DE ANTOFAGASTA | NaN | NaN | NaN | NaN | NEGRO | 49.0 | S
3 | EBCF63CD-99D | JEEP | COMPASS SPORT 2.4 | 2013 | 43482783 | NaN | NaN | NaN | NaN | NaN | 20160418.0 | GRIS | 40.0 | S
4 | FF38B368-B0F | SUBARU | FORESTER 2.0 | 2017 | 25657273 | NaN | NaN | NaN | NaN | NaN | NaN | GRIS | 62.0 | S
5 | BC985B84-711 | KIA MOTORS | NEW SORENTO EX 2.2 | 2013 | 14210133 | CONCON | 5 | NaN | NaN | 0 | NaN | BLANCO | 64.0 | S
6 | D88B232D-2CB | HONDA | PILOT EXL 4X4 3.5 AUT | 2014 | 21198752 | RANCAGUA | DEL LIBERTADOR BERNARDO OHIGGINS | NaN | NaN | NaN | NaN | CAFE | 63.0 | S
7 | AC1A5003-5B6 | KIA MOTORS | NEW CARENS LX 1.7 | 2014 | 24699069 | PROVIDENCIA | METROPOLITANA DE SANTIAGO | NaN | NaN | NaN | 20141205.0 | BLANCO | 61.0 | N
8 | A9F4BE58-817 | DODGE | DURANGO SLT 4X4 5.7 | 2011 | 32883814 | QUILICURA | METROPOLITANA DE SANTIAGO | NaN | 0.0 | 0 | 20120215.0 | BLANCO | 50.0 | N
9 | CA31F279-2F4 | MERCEDES BENZ | ML350 BLUE TEC | 2014 | 23348761 | LOS ANGELES | DEL BIO BIO | NaN | NaN | NaN | 20140801.0 | GRIS | 56.0 | N

Last rows

(index) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
1006959 | B9F82C01-701 | KIA MOTORS | SORENTO EX 2.4 AT | 2014 | 44517740 | QUILICURA | METROPOLITANA DE SANTIAGO | NaN | NaN | NaN | NaN | PLATEADO | 40.0 | S
1006960 | 05CEFF93-C13 | GREAT WALL | HAVAL H3 LE 2.0 | 2013 | 44315049 | CHIGUAYANTE | 8 | NaN | NaN | 9185400 | NaN | BLANCO TITANIO | NaN | N
1006961 | F1EFF2AB-8AC | TOYOTA | RAV 4 4X4 2.4 | 2010 | 34484279 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 50.0 | N
1006962 | F1EFF2AB-8AC | TOYOTA | RAV 4 4X4 2.4 | 2010 | 34970491 | QUINTA NORMAL | METROPOLITANA DE SANTIAGO | NaN | 0.0 | 0 | 20111124.0 | BLANCO | 49.0 | N
1006963 | EAC42959-3FC | TOYOTA | RAV 4 4X4 2.0 | 2017 | 321224931 | NaN | NaN | NaN | NaN | NaN | NaN | PLATEADO | NaN | S
1006964 | 3C64A6BD-180 | CHEVROLET | TRACKER LT 1.8 | 2016 | 43286592 | NaN | NaN | NaN | NaN | 11176000 | NaN | AZUL | NaN | S
1006965 | CAF45D98-12D | RENAULT | DUSTER | 2016 | 321224931 | VICUNA | 04 | NaN | NaN | NaN | NaN | GRIS | NaN | S
1006966 | 684771E1-3B8 | HONDA | CRV EX 4X4 2.4 AUT | 2011 | 43881138 | MAIPU | METROPOLITANA DE SANTIAGO | NaN | 0.0 | 11930000 | NaN | BLANCO | 40.0 | S
1006967 | B150A94C-313 | TOYOTA | RAV4 2.4 AUT | 2010 | 26216251 | LAS CONDES | 13 | NaN | NaN | 7290000 | 20130213.0 | BEIGE | 58.0 | S
1006968 | 784B181C-F6B | SSANGYONG | REXTON 2.7 | 2011 | 39201551 | NUNOA | METROPOLITANA DE SANTIAGO | NaN | 0.0 | 0 | NaN | PLATEADO | 45.0 | S